Background

Paroxysmal nocturnal hemoglobinuria (PNH) is a rare (3.81 per 100,000), treatable clonal hematopoietic stem cell (HSC) disorder characterized by intravascular hemolysis, thrombosis, and smooth muscle dystonias, with bone marrow failure occurring in some cases. Patients undergo lengthy diagnostic journeys that frequently exceed a year. Diagnosis is often made only after a high-morbidity or high-mortality event, such as a stroke. Earlier identification and treatment may reduce disease burden.

Worker et al. (2024) demonstrated feasibility with a sample of 131 patients with PNH and 74 features. We sought to replicate and extend that finding for use in a relevant clinical workflow, with the aim of reducing the time from symptom onset to diagnosis. We evaluated whether a model can identify patients likely to benefit from workup for PNH because they are likely to receive an initial PNH diagnosis 3 to 12 months after an initial workup.

Methods

The study used the Atropos Health Arcadia data asset, which is derived from population health management software used across the United States (67 million patients in total, including 1,208 with PNH).

We used a set of 416K patients with a hemoglobin (Hb) test between 2016 and 2023 and no prior PNH diagnosis; the date of the Hb test served as the index date. Cases were defined by the presence of ICD-10 code D59.5 from 3 to 12 months after the index date (n=306 of 1,049 total PNH patients). Controls were randomly sampled from the total population of patients with an Hb test. For clinical features, the lookback period was up to 24 months before the index date. We used a 70/30 train/test split.
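The case definition above can be sketched as a small labeling function. This is only an illustration under stated assumptions: the function name, the category strings, and the approximation of 3 and 12 months as 90 and 365 days are hypothetical and are not taken from the study. How the study handled patients whose first PNH diagnosis fell outside the 3 to 12 month window is not stated, so the sketch reports them as a separate category rather than deciding.

```python
from datetime import date
from typing import Optional

# Assumption: the abstract gives the window in months; days are an approximation.
WINDOW_START_DAYS = 90   # "3 months" after the index date
WINDOW_END_DAYS = 365    # "12 months" after the index date


def label_patient(index_date: date, first_pnh_dx: Optional[date]) -> str:
    """Classify a patient relative to the 3-to-12-month case window.

    index_date   -- date of the hemoglobin (Hb) test
    first_pnh_dx -- date of the first ICD-10 D59.5 code, or None if never coded
    """
    if first_pnh_dx is None:
        return "no_pnh"            # eligible to be sampled as a control
    delta_days = (first_pnh_dx - index_date).days
    if delta_days < 0:
        return "excluded_prior"    # PNH diagnosis predates the index date
    if WINDOW_START_DAYS <= delta_days <= WINDOW_END_DAYS:
        return "case"
    return "outside_window"        # handling not specified in the abstract
```

Keeping "outside_window" distinct makes it easy to test both choices (exclude, or treat as a control) when reproducing the cohort.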

Several machine learning techniques were attempted; the results of an XGBoost model are reported here. Models were built to maximize the area under the precision-recall curve (AUPRC), clinical resonance, and ease of deployment. In addition, two physicians manually reviewed 50 individuals selected for maximum prediction discord (e.g., high predicted likelihood of PNH but no PNH diagnosis found).
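For readers less familiar with AUPRC, the following is a minimal sketch of one common way to estimate it, the step-wise average precision, written in plain Python. The abstract does not say which estimator or library was used, so the function name and the tie handling here are assumptions for illustration only.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Sums precision at each distinct score threshold, weighted by the
    increase in recall at that threshold. Tied scores are processed together.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank patients from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        current_score = ranked[i][0]
        # Consume every patient sharing this score before evaluating.
        while i < len(ranked) and ranked[i][0] == current_score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / total_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

Unlike AUROC, this quantity depends strongly on how rare the positive class is, which matters for a condition as uncommon as PNH.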

Results

We identified 306 PNH patients (0.07% of the sample) with an initial diagnosis of PNH 3 to 12 months after an Hb test. AUPRC and AUROC were 0.09 and 0.77, respectively, for a full model with 375 features, and 0.08 and 0.77 for a model with 13 features. Important features included history of anemia (aplastic and broadly defined), overall disease burden, age, haptoglobin, and a diagnosis of myelodysplastic syndrome (MDS).
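To put these AUPRC values in context: a classifier with no skill has an expected AUPRC roughly equal to the prevalence of the positive class. The short calculation below compares the reported AUPRC values with the reported 0.07% prevalence. It is a rough sketch, since it assumes the prevalence in the evaluation data matches the rounded figure quoted above; the function name is hypothetical.

```python
def lift_over_prevalence(auprc: float, prevalence: float) -> float:
    """Ratio of a model's AUPRC to the no-skill baseline (the prevalence)."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    return auprc / prevalence


# Values quoted in the Results; 0.07% is a rounded figure.
prevalence = 0.0007
full_model_lift = lift_over_prevalence(0.09, prevalence)     # 375 features
reduced_model_lift = lift_over_prevalence(0.08, prevalence)  # 13 features

print(f"full model:    ~{full_model_lift:.0f}x the no-skill baseline")
print(f"reduced model: ~{reduced_model_lift:.0f}x the no-skill baseline")
```

Under that assumption both models rank cases more than one hundred times better than chance on this metric, even though the absolute AUPRC values look small.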

Clinical review agreed that 100% of “unlikely” PNH patients did not warrant further workup. Among the “false positives” (>34% probability of PNH, but no PNH diagnosis found in 3 to 12 months), physicians indicated 75% to 90% (two physicians) of these patients warranted further workup.

The predictive model is available at https://github.com/atroposhealth/pnh-undiagnosed.

Conclusion

The performance of this machine learning algorithm is sufficient for deployment to suggest potential workup for PNH, with the aim of reducing diagnostic delays for patients with PNH. The threshold for further workup should be tested within the local environment because lab testing (i.e., Hb) patterns vary.
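One way to test a threshold locally is to tabulate, for a few candidate cut-offs, how many patients would be flagged and what precision and recall result on locally labeled data. The sketch below is a hypothetical helper, not part of the published model; the function name, the strict ">" comparison, and the toy data are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence


def threshold_report(
    y_true: Sequence[int],
    probs: Sequence[float],
    thresholds: Sequence[float],
) -> List[Dict[str, Optional[float]]]:
    """Summarize workload and accuracy at each candidate probability threshold."""
    total_pos = sum(y_true)
    rows = []
    for t in thresholds:
        # Labels of patients whose predicted probability exceeds the threshold.
        flagged = [y for y, p in zip(y_true, probs) if p > t]
        tp = sum(flagged)
        rows.append({
            "threshold": t,
            "flagged": len(flagged),
            "precision": tp / len(flagged) if flagged else None,
            "recall": tp / total_pos if total_pos else None,
        })
    return rows


# Toy data, made up purely to show the call; not study data.
toy_labels = [1, 0, 0, 1, 0]
toy_probs = [0.9, 0.4, 0.2, 0.35, 0.1]
report = threshold_report(toy_labels, toy_probs, thresholds=[0.34, 0.5])
for row in report:
    print(row)
```

The "flagged" column is the practical constraint: it shows how many workups a site would need to absorb at each threshold.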
